Distributed Monitoring of Frequent Items

Authors

  • Robert Fuller
  • Mehmed M. Kantardzic
Abstract

Monitoring frequently occurring items is a recurring task in a variety of applications. Although a number of solutions have been proposed, few address the problem in a distributed networked environment. Most past solutions relied on approximating results to lower communication overhead. In this paper we introduce a new algorithm designed for continuously tracking frequent items over distributed data streams, providing either exact or approximate answers. We tested the efficiency of our method using two real-world data sets. The results indicated a significant reduction in communication cost when compared to naïve approaches and an existing efficient algorithm called Top-K Monitoring. Since our method does not rely upon approximations to reduce communication overhead and is explicitly designed for tracking frequent items, it also shows increased quality in its tracking results.
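To make the problem setting concrete, the following is a minimal sketch of the kind of naïve baseline the abstract compares against, not the authors' algorithm. It assumes that every remote site forwards each observed item to a coordinator (one message per item) and that an item counts as frequent when its global count is at least a fraction `phi` of all items seen. The function name, the `phi` parameter, and the sample streams are all hypothetical.

```python
from collections import Counter


def naive_frequent_items(site_streams, phi):
    """Naive baseline (assumed, not from the paper): forward every item.

    site_streams: list of per-site item sequences.
    phi: assumed support threshold, as a fraction of all items seen.
    Returns the set of frequent items and the number of messages sent.
    """
    global_counts = Counter()
    messages = 0
    for stream in site_streams:
        for item in stream:
            # Each observed item costs one message to the coordinator.
            global_counts[item] += 1
            messages += 1
    total = sum(global_counts.values())
    frequent = {item for item, count in global_counts.items()
                if count >= phi * total}
    return frequent, messages


# Hypothetical data: three sites, four items each.
streams = [["a", "b", "a", "c"],
           ["a", "d", "b", "a"],
           ["b", "a", "e", "a"]]
frequent, messages = naive_frequent_items(streams, phi=0.25)
print(sorted(frequent), messages)
```

In this sketch the answer is exact, but the communication cost equals the total stream length, which is the overhead that distributed monitoring methods aim to reduce.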

Similar resources

Monitoring frequent items over distributed data streams

Robert H. Fuller, April 3, 2007. Many important applications require the discovery of items which have occurred frequently. Knowledge of these items is commonly used in anomaly detection and network monitoring tasks. Effective solutions for this problem focus mainly on reducing memory requirements in a centralized environment. These solution...

Test-taking Strategies Used by Successful Iranian Male and Female University Entrance Exam EFL Applicants

This study aimed at identifying the most frequent test-taking strategies used by successful Iranian male and female university entrance exam EFL applicants. To this end, 150 English major male and female freshman students who got admission to three reputable state universities of Isfahan, Shiraz, and Tehran were selected conveniently and purposively. The model used in this study was developed b...

Weighted Itemset Mining from Bigdata using Hadoop

Data items have been extracted using an empirical data mining technique called frequent itemset mining. In the majority of application contexts, items are enriched with weights. Pushing item weights into the itemset extraction process, i.e., mining weighted itemsets rather than traditional itemsets, is an appealing research direction. Although many efficient weighted itemset mining algorithms a...

ProFID: Practical Frequent Item Set Discovery in Peer-to-Peer Networks

This study addresses the problem of discovering frequent items in unstructured P2P networks. This problem is relevant for several distributed services such as cache management, data replication, sensor networks and security. We make three contributions to the current state of the art. First, we propose a fully distributed Protocol for Frequent Item Set Discovery (ProFID) where the result is pro...

CoMMEDIA: Separating Scaramouche from Harlequin to Accurately Estimate Items Frequency in Distributed Data Streams

In this paper, we investigate the problem of estimating the number of times data items recur in very large distributed data streams. We present an alternative approach to the well-known CountMin Sketch in order to reduce the impact of collisions on the accuracy of the estimation. We propose to decrease, for each concerned item, the over-estimation that results from these collisions. Our sk...

Journal title:
  • Trans. MLDM

Volume 1  Issue 

Pages  -

Publication date 2008